Overview

Dataset statistics

Number of variables 22
Number of observations 16941
Missing cells 21
Missing cells (%) < 0.1%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 2.8 MiB
Average record size in memory 176.0 B
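
The percentage and size figures above follow from simple arithmetic on the reported counts. A minimal Python sketch, assuming the cell count is variables times observations and that total memory is the average record size times the number of observations (both are assumptions about how the profiler derives these figures, not something the report states):

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 22
n_observations = 16941
missing_cells = 21
avg_record_bytes = 176.0

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory = average record size x number of observations.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"estimated total size: {total_mib:.1f} MiB")
```

Under these assumptions the missing share is roughly 0.006% of all cells, which is consistent with the "< 0.1%" shown in the report.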

Variable types

Categorical 10
Numeric 9
Text 3

Alerts

meddra_concept_id_3 is highly overall correlated with meddra_concept_name_4 and 6 other fields High correlation
meddra_concept_id_4 is highly overall correlated with meddra_concept_name_4 and 3 other fields High correlation
meddra_concept_code_4 is highly overall correlated with meddra_concept_name_4 and 5 other fields High correlation
meddra_concept_id_2 is highly overall correlated with meddra_concept_name_4 and 5 other fields High correlation
meddra_concept_name_4 is highly overall correlated with meddra_concept_class_id_1 and 7 other fields High correlation
meddra_concept_id is highly overall correlated with meddra_concept_name_4 and 3 other fields High correlation
meddra_concept_class_id_1 is highly overall correlated with meddra_concept_class_id_2 and 6 other fields High correlation
meddra_concept_class_id_2 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
meddra_concept_class_id_3 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
meddra_concept_class_id_4 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
meddra_concept_code_1 is highly overall correlated with meddra_concept_id High correlation
meddra_concept_code_2 is highly overall correlated with meddra_concept_name_4 and 2 other fields High correlation
meddra_concept_code_3 is highly overall correlated with meddra_concept_name_4 and 4 other fields High correlation
relationship_id_12 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
relationship_id_23 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
relationship_id_34 is highly overall correlated with meddra_concept_class_id_1 and 6 other fields High correlation
soc_category is highly overall correlated with meddra_concept_name_4 High correlation
meddra_concept_class_id_1 is highly imbalanced (99.8%) Imbalance
meddra_concept_class_id_2 is highly imbalanced (99.8%) Imbalance
meddra_concept_class_id_3 is highly imbalanced (99.8%) Imbalance
meddra_concept_class_id_4 is highly imbalanced (99.8%) Imbalance
relationship_id_12 is highly imbalanced (99.8%) Imbalance
relationship_id_23 is highly imbalanced (99.8%) Imbalance
relationship_id_34 is highly imbalanced (99.8%) Imbalance
pediatric_adverse_event is highly imbalanced (74.6%) Imbalance
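
One formula that reproduces the 99.8% imbalance figure is one minus the Shannon entropy of the class counts, normalized by the logarithm of the number of classes. The sketch below is an assumption about the scoring, not a confirmed copy of the profiler's implementation; the function name `imbalance_score` is hypothetical, and the counts (16938 and 3) are the ones this report lists for meddra_concept_class_id_1.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes))."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # hypothetical convention for a single-class column
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts reported for meddra_concept_class_id_1: "PT" and "nan".
score = imbalance_score([16938, 3])
print(f"{score:.1%}")
```

A score near 100% means almost all rows fall into one class; a perfectly balanced column scores 0%.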

Reproduction

Analysis started 2025-04-28 13:39:30.188562
Analysis finished 2025-04-28 13:39:44.498958
Duration 14.31 seconds
Software version ydata-profiling v4.16.1
Download configuration config.json

Variables

meddra_concept_name_4
Categorical

High correlation 

Distinct 28
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
Investigations 1648
Injury, poisoning and procedural complications 1240
Nervous system disorders 1199
Infections and infestations 1147
Vascular disorders 967
Other values (23) 10740

Length

Max length 67
Median length 40
Mean length 32.018712
Min length 3

Characters and Unicode

Total characters 542429
Distinct characters 38
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Blood and lymphatic system disorders
5th row Blood and lymphatic system disorders

Common Values

Value Count Frequency (%)
Investigations 1648 9.7%
Injury, poisoning and procedural complications 1240 7.3%
Nervous system disorders 1199 7.1%
Infections and infestations 1147 6.8%
Vascular disorders 967 5.7%
Gastrointestinal disorders 948 5.6%
Skin and subcutaneous tissue disorders 799 4.7%
General disorders and administration site conditions 755 4.5%
Neoplasms benign, malignant and unspecified (incl cysts and polyps) 749 4.4%
Musculoskeletal and connective tissue disorders 721 4.3%
Other values (18) 6768 40.0%
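
The frequency column can be recomputed from the counts. This is a small sketch, assuming each percentage is the count divided by the 16941 observations and rounded to one decimal place:

```python
# Counts copied from the Common Values table for meddra_concept_name_4.
counts = {
    "Investigations": 1648,
    "Injury, poisoning and procedural complications": 1240,
    "Nervous system disorders": 1199,
    "Infections and infestations": 1147,
    "Vascular disorders": 967,
    "Gastrointestinal disorders": 948,
    "Skin and subcutaneous tissue disorders": 799,
    "General disorders and administration site conditions": 755,
    "Neoplasms benign, malignant and unspecified (incl cysts and polyps)": 749,
    "Musculoskeletal and connective tissue disorders": 721,
    "Other values (18)": 6768,
}
n_observations = 16941

# Assumption: frequency = count / observations, shown with one decimal.
frequencies = {
    name: round(100 * count / n_observations, 1) for name, count in counts.items()
}

for name, pct in frequencies.items():
    print(f"{name}: {pct}%")
```

The counts sum to the number of observations, so the table accounts for every row.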

Length

Histogram of lengths of the category

Value Count Frequency (%)
disorders 10994 17.6%
and 10296 16.5%
system 2610 4.2%
investigations 1648 2.6%
tissue 1520 2.4%
injury 1240 2.0%
complications 1240 2.0%
procedural 1240 2.0%
poisoning 1240 2.0%
nervous 1199 1.9%
Other values (52) 29228 46.8%

Most occurring characters

Value Count Frequency (%)
s 57630 10.6%
i 49043 9.0%
(space) 45514 8.4%
n 43563 8.0%
e 41721 7.7%
d 39674 7.3%
r 39605 7.3%
a 36697 6.8%
o 35588 6.6%
t 29568 5.5%
Other values (28) 123826 22.8%

Most occurring categories

Value Count Frequency (%)
(unknown) 542429 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 542429 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 542429 100.0%

meddra_concept_id
Real number (ℝ)

High correlation 

Distinct 10770
Distinct (%) 63.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 36596084
Minimum 788090
Maximum 46277190
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 788090
5-th percentile 35104718
Q1 35707652
median 36211132
Q3 36918848
95-th percentile 43562831
Maximum 46277190
Range 45489100
Interquartile range (IQR) 1211196

Descriptive statistics

Standard deviation 5218103.3
Coefficient of variation (CV) 0.14258639
Kurtosis 31.037889
Mean 36596084
Median Absolute Deviation (MAD) 603687
Skewness -4.4583286
Sum 6.1997425 × 10^11
Variance 2.7228602 × 10^13
Monotonicity Not monotonic
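
Several of the statistics above are derived from others in the same block. The following sketch rechecks them from the reported figures, assuming the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean x observations); the inputs are the rounded values printed in the report, so small discrepancies are expected.

```python
# Reported figures for meddra_concept_id.
minimum, maximum = 788090, 46277190
q1, q3 = 35707652, 36918848
mean, std = 36596084, 5218103.3
n_observations = 16941

value_range = maximum - minimum   # reported: 45489100
iqr = q3 - q1                     # reported: 1211196
cv = std / mean                   # reported: 0.14258639
variance = std ** 2               # reported: 2.7228602 x 10^13
total = mean * n_observations     # reported: 6.1997425 x 10^11

print(value_range, iqr)
print(f"CV = {cv:.8f}")
print(f"variance = {variance:.7e}")
print(f"sum = {total:.7e}")
```

The large gap between the range and the IQR reflects the small group of identifiers near 788090, far below the bulk of the values.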
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
35305782 7 < 0.1%
35104213 7 < 0.1%
35708637 7 < 0.1%
35306164 7 < 0.1%
35205147 7 < 0.1%
35205157 7 < 0.1%
35305785 6 < 0.1%
35305794 6 < 0.1%
35305466 6 < 0.1%
42889057 6 < 0.1%
Other values (10760) 16875 99.6%

Minimum 10 values

Value Count Frequency (%)
788090 1 < 0.1%
788094 1 < 0.1%
788095 1 < 0.1%
788096 2 < 0.1%
788098 2 < 0.1%
788100 2 < 0.1%
788104 1 < 0.1%
788105 3 < 0.1%
788115 1 < 0.1%
788120 3 < 0.1%

Maximum 10 values

Value Count Frequency (%)
46277190 2 < 0.1%
46277169 2 < 0.1%
46277163 1 < 0.1%
46276846 1 < 0.1%
46276844 1 < 0.1%
46276840 2 < 0.1%
46276826 1 < 0.1%
46276825 1 < 0.1%
46276824 2 < 0.1%
46276815 2 < 0.1%

neventreports
Real number (ℝ)

Distinct 720
Distinct (%) 4.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 74.279086
Minimum 1
Maximum 16798
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 2
median 6
Q3 24
95-th percentile 295
Maximum 16798
Range 16797
Interquartile range (IQR) 22

Descriptive statistics

Standard deviation 388.00587
Coefficient of variation (CV) 5.223622
Kurtosis 450.99997
Mean 74.279086
Median Absolute Deviation (MAD) 5
Skewness 16.761267
Sum 1258362
Variance 150548.56
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
1 3789 22.4%
2 1713 10.1%
3 1275 7.5%
4 867 5.1%
5 689 4.1%
6 640 3.8%
7 502 3.0%
8 442 2.6%
9 378 2.2%
10 290 1.7%
Other values (710) 6356 37.5%

Minimum 10 values

Value Count Frequency (%)
1 3789 22.4%
2 1713 10.1%
3 1275 7.5%
4 867 5.1%
5 689 4.1%
6 640 3.8%
7 502 3.0%
8 442 2.6%
9 378 2.2%
10 290 1.7%

Maximum 10 values

Value Count Frequency (%)
16798 1 < 0.1%
13250 1 < 0.1%
11598 1 < 0.1%
10425 1 < 0.1%
9281 1 < 0.1%
8969 1 < 0.1%
6233 1 < 0.1%
6040 1 < 0.1%
5973 1 < 0.1%
5780 1 < 0.1%

meddra_concept_class_id_1
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
PT 16938
nan 3

Length

Max length 3
Median length 2
Mean length 2.0001771
Min length 2

Characters and Unicode

Total characters 33885
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row PT
5th row PT

Common Values

Value Count Frequency (%)
PT 16938 > 99.9%
nan 3 < 0.1%
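
The length and character statistics for this variable can be rebuilt from the two value counts alone. A minimal sketch, assuming "nan" is stored as a literal three-character string (the report shows Missing 0 for this variable, which is consistent with that reading):

```python
from collections import Counter

# Value counts reported for meddra_concept_class_id_1.
value_counts = {"PT": 16938, "nan": 3}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

# Character frequencies, weighted by how often each value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

print(f"total characters: {total_chars}")
print(f"mean length: {mean_length:.7f}")
print(dict(char_counts))
```

The three "nan" rows explain why the mean length sits slightly above 2 and why the characters n and a appear in the character table at all.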

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
pt 16938 > 99.9%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
P 16938 50.0%
T 16938 50.0%
n 6 < 0.1%
a 3 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 33885 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 33885 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 33885 100.0%

meddra_concept_class_id_2
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
HLT 16938
nan 3

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 50823
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row HLT
5th row HLT

Common Values

Value Count Frequency (%)
HLT 16938 > 99.9%
nan 3 < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
hlt 16938 > 99.9%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
H 16938 33.3%
L 16938 33.3%
T 16938 33.3%
n 6 < 0.1%
a 3 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 50823 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 50823 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 50823 100.0%

meddra_concept_class_id_3
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
HLGT 16938
nan 3

Length

Max length 4
Median length 4
Mean length 3.9998229
Min length 3

Characters and Unicode

Total characters 67761
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row HLGT
5th row HLGT

Common Values

Value Count Frequency (%)
HLGT 16938 > 99.9%
nan 3 < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
hlgt 16938 > 99.9%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
H 16938 25.0%
L 16938 25.0%
G 16938 25.0%
T 16938 25.0%
n 6 < 0.1%
a 3 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 67761 100.0%

meddra_concept_class_id_4
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
SOC 16938
nan 3

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 50823
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row SOC
5th row SOC

Common Values

Value Count Frequency (%)
SOC 16938 > 99.9%
nan 3 < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
soc 16938 > 99.9%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
S 16938 33.3%
O 16938 33.3%
C 16938 33.3%
n 6 < 0.1%
a 3 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 50823 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 50823 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 50823 100.0%

meddra_concept_code_1
Real number (ℝ)

High correlation 

Distinct 10767
Distinct (%) 63.6%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 10044712
Minimum 10000021
Maximum 10078675
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 10000021
5-th percentile 10005334
Q1 10024387
median 10050214
Q3 10063422
95-th percentile 10074760
Maximum 10078675
Range 78654
Interquartile range (IQR) 39035

Descriptive statistics

Standard deviation 22515.539
Coefficient of variation (CV) 0.0022415316
Kurtosis -1.0872844
Mean 10044712
Median Absolute Deviation (MAD) 16172.5
Skewness -0.42468787
Sum 1.7013733 × 10^11
Variance 5.069495 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10053869 7 < 0.1%
10034872 7 < 0.1%
10004213 7 < 0.1%
10051707 7 < 0.1%
10063935 7 < 0.1%
10044689 7 < 0.1%
10069116 6 < 0.1%
10072010 6 < 0.1%
10067010 6 < 0.1%
10050361 6 < 0.1%
Other values (10757) 16872 99.6%

Minimum 10 values

Value Count Frequency (%)
10000021 2 < 0.1%
10000028 1 < 0.1%
10000050 2 < 0.1%
10000059 1 < 0.1%
10000060 1 < 0.1%
10000077 1 < 0.1%
10000081 1 < 0.1%
10000084 1 < 0.1%
10000087 1 < 0.1%
10000090 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
10078675 2 < 0.1%
10078668 2 < 0.1%
10078659 2 < 0.1%
10078651 2 < 0.1%
10078638 5 < 0.1%
10078602 1 < 0.1%
10078581 2 < 0.1%
10078580 1 < 0.1%
10078576 2 < 0.1%
10078575 1 < 0.1%

meddra_concept_code_2
Real number (ℝ)

High correlation 

Distinct 1619
Distinct (%) 9.6%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 10029846
Minimum 10000032
Maximum 10077699
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 10000032
5-th percentile 10003818
Q1 10016462
median 10027696
Q3 10040948
95-th percentile 10068755
Maximum 10077699
Range 77667
Interquartile range (IQR) 24486

Descriptive statistics

Standard deviation 18291.903
Coefficient of variation (CV) 0.0018237472
Kurtosis -0.16890637
Mean 10029846
Median Absolute Deviation (MAD) 13090
Skewness 0.57898279
Sum 1.6988553 × 10^11
Variance 3.3459373 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10021544 146 0.9%
10022097 126 0.7%
10004047 106 0.6%
10003057 102 0.6%
10018987 101 0.6%
10068753 92 0.5%
10027700 89 0.5%
10018072 82 0.5%
10027682 82 0.5%
10012424 82 0.5%
Other values (1609) 15930 94.0%

Minimum 10 values

Value Count Frequency (%)
10000032 21 0.1%
10000063 2 < 0.1%
10000072 3 < 0.1%
10000117 7 < 0.1%
10000135 2 < 0.1%
10000171 9 0.1%
10000178 8 < 0.1%
10000190 6 < 0.1%
10000191 2 < 0.1%
10000192 11 0.1%

Maximum 10 values

Value Count Frequency (%)
10077699 5 < 0.1%
10077550 6 < 0.1%
10077549 3 < 0.1%
10077548 30 0.2%
10077547 9 0.1%
10077545 2 < 0.1%
10077544 1 < 0.1%
10077542 1 < 0.1%
10077540 1 < 0.1%
10077538 2 < 0.1%

meddra_concept_code_3
Real number (ℝ)

High correlation 

Distinct 337
Distinct (%) 2.0%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 10026499
Minimum 10000073
Maximum 10077546
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 10000073
5-th percentile 10001708
Q1 10014623
median 10023213
Q3 10038612
95-th percentile 10065122
Maximum 10077546
Range 77473
Interquartile range (IQR) 23989

Descriptive statistics

Standard deviation 17669.616
Coefficient of variation (CV) 0.0017622917
Kurtosis 0.16329273
Mean 10026499
Median Absolute Deviation (MAD) 11908
Skewness 0.73123265
Sum 1.6982884 × 10^11
Variance 3.1221532 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10001316 560 3.3%
10021879 382 2.3%
10069888 359 2.1%
10014982 359 2.1%
10004018 282 1.7%
10047075 274 1.6%
10029305 269 1.6%
10022114 264 1.6%
10018851 249 1.5%
10047438 225 1.3%
Other values (327) 13715 81.0%

Minimum 10 values

Value Count Frequency (%)
10000073 45 0.3%
10000211 20 0.1%
10000485 29 0.2%
10000546 64 0.4%
10001302 5 < 0.1%
10001316 560 3.3%
10001353 39 0.2%
10001474 8 < 0.1%
10001708 146 0.9%
10002086 40 0.2%

Maximum 10 values

Value Count Frequency (%)
10077546 11 0.1%
10077537 59 0.3%
10076290 26 0.2%
10074469 4 < 0.1%
10072990 28 0.2%
10071947 105 0.6%
10071940 65 0.4%
10069888 359 2.1%
10069782 38 0.2%
10069781 59 0.3%

meddra_concept_code_4
Real number (ℝ)

High correlation 

Distinct 27
Distinct (%) 0.2%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 10027122
Minimum 10005329
Maximum 10077536
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 10005329
5-th percentile 10007541
Q1 10019805
median 10022891
Q3 10037175
95-th percentile 10047065
Maximum 10077536
Range 72207
Interquartile range (IQR) 17370

Descriptive statistics

Standard deviation 11351.636
Coefficient of variation (CV) 0.0011320932
Kurtosis 1.0165171
Mean 10027122
Median Absolute Deviation (MAD) 6314
Skewness 0.55127251
Sum 1.6983939 × 10^11
Variance 1.2885964 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=27)

Common Values

Value Count Frequency (%)
10022891 1648 9.7%
10022117 1240 7.3%
10029205 1199 7.1%
10021881 1147 6.8%
10047065 967 5.7%
10017947 948 5.6%
10040785 799 4.7%
10018065 755 4.5%
10029104 749 4.4%
10028395 721 4.3%
Other values (17) 6765 39.9%

Minimum 10 values

Value Count Frequency (%)
10005329 485 2.9%
10007541 393 2.3%
10010331 615 3.6%
10013993 98 0.6%
10014698 251 1.5%
10015919 514 3.0%
10017947 948 5.6%
10018065 755 4.5%
10019805 236 1.4%
10021428 450 2.7%

Maximum 10 values

Value Count Frequency (%)
10077536 97 0.6%
10047065 967 5.7%
10042613 576 3.4%
10041244 122 0.7%
10040785 799 4.7%
10038738 672 4.0%
10038604 476 2.8%
10038359 407 2.4%
10037175 566 3.3%
10036585 365 2.2%

meddra_concept_id_2
Real number (ℝ)

High correlation 

Distinct 1619
Distinct (%) 9.6%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 36430341
Minimum 788073
Maximum 45885357
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 788073
5-th percentile 35202463
Q1 35802820
median 36303167
Q3 37003673
95-th percentile 37604043
Maximum 45885357
Range 45097284
Interquartile range (IQR) 1200853

Descriptive statistics

Standard deviation 3065518.4
Coefficient of variation (CV) 0.084147398
Kurtosis 97.877037
Mean 36430341
Median Absolute Deviation (MAD) 600384
Skewness -7.9705946
Sum 6.1705712 × 10^11
Variance 9.3974032 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
35802818 146 0.9%
35802820 126 0.7%
36102898 106 0.6%
35802817 102 0.6%
37604043 101 0.6%
35802819 92 0.5%
37503992 89 0.5%
35802829 82 0.5%
36002886 82 0.5%
37303796 82 0.5%
Other values (1609) 15930 94.0%

Minimum 10 values

Value Count Frequency (%)
788073 23 0.1%
788074 4 < 0.1%
788075 5 < 0.1%
788076 2 < 0.1%
788078 1 < 0.1%
788080 1 < 0.1%
788082 1 < 0.1%
788083 2 < 0.1%
788084 9 0.1%
788085 30 0.2%

Maximum 10 values

Value Count Frequency (%)
45885357 18 0.1%
45885356 3 < 0.1%
45885355 3 < 0.1%
45885354 2 < 0.1%
45885353 23 0.1%
45885352 23 0.1%
45885351 9 0.1%
45885349 13 0.1%
45885348 21 0.1%
45885347 20 0.1%

meddra_concept_id_3
Real number (ℝ)

High correlation 

Distinct 337
Distinct (%) 2.0%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 36501597
Minimum 788071
Maximum 45885340
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 788071
5-th percentile 35202053
Q1 35802128
median 36302182
Q3 37102294
95-th percentile 37602361
Maximum 45885340
Range 45097269
Interquartile range (IQR) 1300166

Descriptive statistics

Standard deviation 2739892.8
Coefficient of variation (CV) 0.075062272
Kurtosis 117.74614
Mean 36501597
Median Absolute Deviation (MAD) 600064
Skewness -8.6002069
Sum 6.1826405 × 10^11
Variance 7.5070125 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
35802128 560 3.3%
36102149 382 2.3%
42888893 359 2.1%
37302323 359 2.1%
36102144 282 1.7%
37602361 274 1.6%
36702247 269 1.6%
36202157 264 1.6%
36302167 249 1.5%
36102154 225 1.3%
Other values (327) 13715 81.0%

Minimum 10 values

Value Count Frequency (%)
788071 59 0.3%
788072 11 0.1%
35102033 40 0.2%
35102034 55 0.3%
35102035 30 0.2%
35102036 56 0.3%
35102037 10 0.1%
35102038 34 0.2%
35102039 98 0.6%
35102040 12 0.1%

Maximum 10 values

Value Count Frequency (%)
45885340 26 0.2%
45885339 4 < 0.1%
43053687 28 0.2%
42888894 65 0.4%
42888893 359 2.1%
42888892 105 0.6%
42888891 38 0.2%
42888890 59 0.3%
37602365 19 0.1%
37602364 7 < 0.1%

meddra_concept_id_4
Real number (ℝ)

High correlation 

Distinct 27
Distinct (%) 0.2%
Missing 3
Missing (%) < 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 36200516
Minimum 788070
Maximum 37600000
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 132.5 KiB

Quantile statistics

Minimum 788070
5-th percentile 35200000
Q1 35900000
median 36300000
Q3 36900000
95-th percentile 37600000
Maximum 37600000
Range 36811930
Interquartile range (IQR) 1000000

Descriptive statistics

Standard deviation 2772093
Coefficient of variation (CV) 0.076576063
Kurtosis 149.58382
Mean 36200516
Median Absolute Deviation (MAD) 500000
Skewness -11.926897
Sum 6.1316434 × 10^11
Variance 7.6844996 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=27)

Common Values

Value Count Frequency (%)
36300000 1648 9.7%
36200000 1240 7.3%
36700000 1199 7.1%
36100000 1147 6.8%
37600000 967 5.7%
35700000 948 5.6%
37300000 799 4.7%
35800000 755 4.5%
36600000 749 4.4%
36500000 721 4.3%
Other values (17) 6765 39.9%

Minimum 10 values

Value Count Frequency (%)
788070 97 0.6%
35100000 485 2.9%
35200000 393 2.3%
35300000 615 3.6%
35400000 98 0.6%
35500000 251 1.5%
35600000 514 3.0%
35700000 948 5.6%
35800000 755 4.5%
35900000 236 1.4%

Maximum 10 values

Value Count Frequency (%)
37600000 967 5.7%
37500000 576 3.4%
37400000 122 0.7%
37300000 799 4.7%
37200000 672 4.0%
37100000 476 2.8%
37000000 407 2.4%
36900000 566 3.3%
36800000 365 2.2%
36700000 1199 7.1%
Distinct 10768
Distinct (%) 63.6%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB

Length

Max length 93
Median length 57
Mean length 21.634496
Min length 3

Characters and Unicode

Total characters 366510
Distinct characters 70
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 5906
Unique (%) 34.9%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Gelatinous transformation of the bone marrow
5th row Eosinophilic granulomatosis with polyangiitis
Value Count Frequency (%)
site 891 2.1%
syndrome 711 1.7%
infection 508 1.2%
disorder 462 1.1%
increased 427 1.0%
abnormal 425 1.0%
of 384 0.9%
congenital 349 0.8%
decreased 325 0.8%
blood 307 0.7%
Other values (5573) 37447 88.7%

Most occurring characters

Value Count Frequency (%)
e 34008 9.3%
i 31298 8.5%
a 31002 8.5%
o 25615 7.0%
(space) 25295 6.9%
r 24750 6.8%
t 23930 6.5%
n 23155 6.3%
s 21736 5.9%
l 17347 4.7%
Other values (60) 108374 29.6%

Most occurring categories

Value Count Frequency (%)
(unknown) 366510 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 366510 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 366510 100.0%

Distinct 1620
Distinct (%) 9.6%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB

Length

Max length 82
Median length 62
Mean length 32.316864
Min length 3

Characters and Unicode

Total characters 547480
Distinct characters 57
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 148
Unique (%) 0.9%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Marrow depression and hypoplastic anaemias
5th row Eosinophilic disorders
Value Count Frequency (%)
and 6447 9.7%
nec 5508 8.3%
disorders 3918 5.9%
infections 1947 2.9%
neoplasms 1018 1.5%
analyses 897 1.3%
procedures 842 1.3%
congenital 827 1.2%
tissue 751 1.1%
site 749 1.1%
Other values (1276) 43804 65.7%

Most occurring characters

Value Count Frequency (%)
(space) 49767 9.1%
e 45308 8.3%
i 44405 8.1%
s 44390 8.1%
a 43095 7.9%
n 39093 7.1%
o 34406 6.3%
r 33723 6.2%
t 31507 5.8%
c 23806 4.3%
Other values (47) 157980 28.9%

Most occurring categories

Value Count Frequency (%)
(unknown) 547480 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 547480 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 547480 100.0%

Distinct 338
Distinct (%) 2.0%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB

Length

Max length 86
Median length 56
Mean length 35.634673
Min length 3

Characters and Unicode

Total characters 603687
Distinct characters 53
Distinct categories 1 ?
Distinct scripts 1 ?
Distinct blocks 1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Anaemias nonhaemolytic and marrow depression
5th row White blood cell disorders
Value Count Frequency (%)
and 7893 11.2%
disorders 7052 10.0%
nec 2836 4.0%
conditions 1529 2.2%
investigations 1417 2.0%
vascular 1299 1.8%
excl 1289 1.8%
infections 1185 1.7%
neoplasms 1019 1.4%
congenital 1001 1.4%
Other values (431) 43802 62.3%
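
The word table appears to count lower-cased tokens (for example `nec` and `excl`). A minimal sketch of such a count follows, assuming values are lower-cased and split on non-word characters; that tokenisation rule and the name `word_frequencies` are assumptions, not something stated in the report.

```python
import re
from collections import Counter


def word_frequencies(values):
    """Count lower-cased word tokens across all values (assumed tokenisation)."""
    counts = Counter()
    for value in values:
        # \w+ keeps runs of letters, digits and underscores and drops punctuation.
        counts.update(re.findall(r"\w+", str(value).lower()))
    return counts


# Illustrative input: three values of the kind shown in the Sample section.
freqs = word_frequencies([
    "Platelet disorders",
    "White blood cell disorders",
    "Vascular disorders NEC",
])
total = sum(freqs.values())
for word, count in freqs.most_common(3):
    print(f"{word} {count} {100 * count / total:.1f}%")
```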

Most occurring characters

Value Count Frequency (%)
(space) 53381 8.8%
i 52858 8.8%
e 51012 8.5%
s 50599 8.4%
n 44386 7.4%
a 43177 7.2%
r 41263 6.8%
o 39926 6.6%
t 35080 5.8%
d 32479 5.4%
Other values (43) 159526 26.4%

Most occurring categories

Value Count Frequency (%)
(unknown) 603687 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 603687 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 603687 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

relationship_id_12
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
Is a 16938
nan 3

Length

Max length 4
Median length 4
Mean length 3.9998229
Min length 3
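
These length statistics can be checked against the value counts reported for this variable (16938 occurrences of `Is a` and 3 of `nan`). The sketch below assumes that the three `nan` entries are stored as the literal three-character string, which is consistent with the report listing no missing values for this column.

```python
from statistics import mean, median

# Rebuild the column from the reported value counts (assumption: "nan" is a
# literal string here, not a true missing value).
values = ["Is a"] * 16938 + ["nan"] * 3
lengths = [len(v) for v in values]

mean_length = mean(lengths)
print("Max length", max(lengths))
print("Median length", median(lengths))
print("Mean length", round(mean_length, 7))
print("Min length", min(lengths))
print("Total characters", sum(lengths))
```

The total of 67761 characters divided by 16941 rows gives the reported mean of about 3.9998229.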

Characters and Unicode

Total characters 67761
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Is a
5th row Is a

Common Values

Value Count Frequency (%)
Is a 16938 > 99.9%
nan 3 < 0.1%
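
The Imbalance alert reflects how unevenly the values are distributed. One common way to quantify this is one minus the normalised Shannon entropy of the value counts. The sketch below assumes that definition; the name `imbalance_score` is hypothetical, and the report's exact formula and alert threshold are not shown in this document.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts reported for this variable: "Is a" 16938, "nan" 3.
score = imbalance_score([16938, 3])
print(f"imbalance score: {score:.4f}")
```

Under this definition a 50/50 split scores 0, while the 16938-to-3 split here scores very close to 1.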

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value Count Frequency (%)
is 16938 50.0%
a 16938 50.0%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
a 16941 25.0%
I 16938 25.0%
s 16938 25.0%
(space) 16938 25.0%
n 6 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 67761 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

relationship_id_23
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
Is a 16938
nan 3

Length

Max length 4
Median length 4
Mean length 3.9998229
Min length 3

Characters and Unicode

Total characters 67761
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Is a
5th row Is a

Common Values

Value Count Frequency (%)
Is a 16938 > 99.9%
nan 3 < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value Count Frequency (%)
is 16938 50.0%
a 16938 50.0%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
a 16941 25.0%
I 16938 25.0%
s 16938 25.0%
(space) 16938 25.0%
n 6 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 67761 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

relationship_id_34
Categorical

High correlation  Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
Is a 16938
nan 3

Length

Max length 4
Median length 4
Mean length 3.9998229
Min length 3

Characters and Unicode

Total characters 67761
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row Is a
5th row Is a

Common Values

Value Count Frequency (%)
Is a 16938 > 99.9%
nan 3 < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value Count Frequency (%)
is 16938 50.0%
a 16938 50.0%
nan 3 < 0.1%

Most occurring characters

Value Count Frequency (%)
a 16941 25.0%
I 16938 25.0%
s 16938 25.0%
(space) 16938 25.0%
n 6 < 0.1%

Most occurring categories

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 67761 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 67761 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

soc_category
Categorical

High correlation 

Distinct 9
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
anatomic_site_disorder 9487
procedural_disorder 1816
biological_disorder 1806
lab_test_disorder 1648
infection_disorder 1147
Other values (4) 1037

Length

Max length 22
Median length 22
Mean length 20.309545
Min length 3

Characters and Unicode

Total characters 344064
Distinct characters 19
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row nan
2nd row nan
3rd row nan
4th row anatomic_site_disorder
5th row anatomic_site_disorder

Common Values

Value Count Frequency (%)
anatomic_site_disorder 9487 56.0%
procedural_disorder 1816 10.7%
biological_disorder 1806 10.7%
lab_test_disorder 1648 9.7%
infection_disorder 1147 6.8%
immune_system_disorder 450 2.7%
foreign_disorder 365 2.2%
social_disorder 122 0.7%
nan 100 0.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value Count Frequency (%)
anatomic_site_disorder 9487 56.0%
procedural_disorder 1816 10.7%
biological_disorder 1806 10.7%
lab_test_disorder 1648 9.7%
infection_disorder 1147 6.8%
immune_system_disorder 450 2.7%
foreign_disorder 365 2.2%
social_disorder 122 0.7%
nan 100 0.6%

Most occurring characters

Value Count Frequency (%)
i 42658 12.4%
r 37679 11.0%
d 35498 10.3%
o 33390 9.7%
e 32204 9.4%
s 28998 8.4%
_ 28426 8.3%
a 24466 7.1%
t 23867 6.9%
c 14378 4.2%
Other values (9) 42500 12.4%

Most occurring categories

Value Count Frequency (%)
(unknown) 344064 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 344064 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 344064 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

pediatric_adverse_event
Categorical

Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 132.5 KiB
0 16220
1 721

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 16941
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value Count Frequency (%)
0 16220 95.7%
1 721 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value Count Frequency (%)
0 16220 95.7%
1 721 4.3%

Most occurring characters

Value Count Frequency (%)
0 16220 95.7%
1 721 4.3%

Most occurring categories

Value Count Frequency (%)
(unknown) 16941 100.0%

Most occurring scripts

Value Count Frequency (%)
(unknown) 16941 100.0%

Most occurring blocks

Value Count Frequency (%)
(unknown) 16941 100.0%

Most frequent character per category, script and block

Only one category, script and block is present, each labelled (unknown), so each per-group table is identical to the "Most occurring characters" table.

Interactions

[Figures: 81 pairwise interaction plots; images not reproduced.]

Correlations

[Figure: correlation heatmap; the corresponding matrix follows.]
meddra_concept_id neventreports meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 pediatric_adverse_event
meddra_concept_id 1.000 -0.004 0.078 0.040 0.011 0.053 0.045 0.061 0.008 0.027
neventreports -0.004 1.000 -0.107 0.005 0.017 0.003 0.022 0.014 -0.010 -0.013
meddra_concept_code_1 0.078 -0.107 1.000 0.145 0.094 0.037 0.008 0.008 -0.071 0.009
meddra_concept_code_2 0.040 0.005 0.145 1.000 0.311 0.087 0.046 0.005 -0.151 -0.027
meddra_concept_code_3 0.011 0.017 0.094 0.311 1.000 0.132 0.016 0.121 -0.180 -0.008
meddra_concept_code_4 0.053 0.003 0.037 0.087 0.132 1.000 0.163 0.004 -0.099 0.033
meddra_concept_id_2 0.045 0.022 0.008 0.046 0.016 0.163 1.000 0.267 0.054 0.021
meddra_concept_id_3 0.061 0.014 0.008 0.005 0.121 0.004 0.267 1.000 0.568 0.001
meddra_concept_id_4 0.008 -0.010 -0.071 -0.151 -0.180 -0.099 0.054 0.568 1.000 0.017
pediatric_adverse_event 0.027 -0.013 0.009 -0.027 -0.008 0.033 0.021 0.001 0.017 1.000
[Figure: correlation heatmap; the corresponding matrix follows.]
meddra_concept_id neventreports meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 pediatric_adverse_event
meddra_concept_id 1.000 -0.060 0.262 0.103 0.083 0.446 0.488 0.475 0.440 0.040
neventreports -0.060 1.000 -0.246 -0.009 -0.019 -0.004 0.012 0.000 -0.012 0.007
meddra_concept_code_1 0.262 -0.246 1.000 0.123 0.056 0.011 0.032 0.047 -0.013 0.016
meddra_concept_code_2 0.103 -0.009 0.123 1.000 0.292 0.069 0.153 0.117 0.037 -0.024
meddra_concept_code_3 0.083 -0.019 0.056 0.292 1.000 0.129 0.177 0.260 0.095 0.009
meddra_concept_code_4 0.446 -0.004 0.011 0.069 0.129 1.000 0.867 0.856 0.966 0.032
meddra_concept_id_2 0.488 0.012 0.032 0.153 0.177 0.867 1.000 0.906 0.865 0.032
meddra_concept_id_3 0.475 0.000 0.047 0.117 0.260 0.856 0.906 1.000 0.864 0.019
meddra_concept_id_4 0.440 -0.012 -0.013 0.037 0.095 0.966 0.865 0.864 1.000 0.035
pediatric_adverse_event 0.040 0.007 0.016 -0.024 0.009 0.032 0.032 0.019 0.035 1.000
[Figure: correlation heatmap; the corresponding matrix follows.]
meddra_concept_id neventreports meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 pediatric_adverse_event
meddra_concept_id 1.000 -0.041 0.188 0.071 0.066 0.386 0.436 0.420 0.382 0.033
neventreports -0.041 1.000 -0.171 -0.006 -0.013 -0.003 0.008 0.000 -0.009 0.006
meddra_concept_code_1 0.188 -0.171 1.000 0.087 0.039 0.008 0.022 0.032 -0.008 0.013
meddra_concept_code_2 0.071 -0.006 0.087 1.000 0.232 0.050 0.109 0.083 0.028 -0.020
meddra_concept_code_3 0.066 -0.013 0.039 0.232 1.000 0.106 0.146 0.207 0.083 0.008
meddra_concept_code_4 0.386 -0.003 0.008 0.050 0.106 1.000 0.826 0.825 0.976 0.026
meddra_concept_id_2 0.436 0.008 0.022 0.109 0.146 0.826 1.000 0.891 0.825 0.026
meddra_concept_id_3 0.420 0.000 0.032 0.083 0.207 0.825 0.891 1.000 0.831 0.015
meddra_concept_id_4 0.382 -0.009 -0.008 0.028 0.083 0.976 0.825 0.831 1.000 0.029
pediatric_adverse_event 0.033 0.006 0.013 -0.020 0.008 0.026 0.026 0.015 0.029 1.000
[Figure: correlation heatmap; the corresponding matrix follows.]
meddra_concept_name_4 meddra_concept_id neventreports meddra_concept_class_id_1 meddra_concept_class_id_2 meddra_concept_class_id_3 meddra_concept_class_id_4 meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 relationship_id_12 relationship_id_23 relationship_id_34 soc_category pediatric_adverse_event
meddra_concept_name_4 1.000 0.732 0.055 1.000 1.000 1.000 1.000 0.323 0.620 0.780 1.000 0.822 0.926 1.000 1.000 1.000 1.000 1.000 0.449
meddra_concept_id 0.732 1.000 0.017 0.000 0.000 0.000 0.000 0.700 0.157 0.211 0.423 0.609 0.609 0.101 0.000 0.000 0.000 0.374 0.116
neventreports 0.055 0.017 1.000 0.000 0.000 0.000 0.000 0.059 0.038 0.045 0.019 0.046 0.047 0.000 0.000 0.000 0.000 0.000 0.000
meddra_concept_class_id_1 1.000 0.000 0.000 1.000 0.966 0.966 0.966 NaN NaN NaN NaN NaN NaN NaN 0.966 0.966 0.966 0.172 0.000
meddra_concept_class_id_2 1.000 0.000 0.000 0.966 1.000 0.966 0.966 NaN NaN NaN NaN NaN NaN NaN 0.966 0.966 0.966 0.172 0.000
meddra_concept_class_id_3 1.000 0.000 0.000 0.966 0.966 1.000 0.966 NaN NaN NaN NaN NaN NaN NaN 0.966 0.966 0.966 0.172 0.000
meddra_concept_class_id_4 1.000 0.000 0.000 0.966 0.966 0.966 1.000 NaN NaN NaN NaN NaN NaN NaN 0.966 0.966 0.966 0.172 0.000
meddra_concept_code_1 0.323 0.700 0.059 NaN NaN NaN NaN 1.000 0.427 0.295 0.171 0.173 0.191 0.165 NaN NaN NaN 0.191 0.108
meddra_concept_code_2 0.620 0.157 0.038 NaN NaN NaN NaN 0.427 1.000 0.781 0.391 0.742 0.434 0.435 NaN NaN NaN 0.380 0.104
meddra_concept_code_3 0.780 0.211 0.045 NaN NaN NaN NaN 0.295 0.781 1.000 0.551 0.539 0.782 0.384 NaN NaN NaN 0.498 0.152
meddra_concept_code_4 1.000 0.423 0.019 NaN NaN NaN NaN 0.171 0.391 0.551 1.000 0.684 0.802 1.000 NaN NaN NaN 0.817 0.118
meddra_concept_id_2 0.822 0.609 0.046 NaN NaN NaN NaN 0.173 0.742 0.539 0.684 1.000 0.924 0.254 NaN NaN NaN 0.423 0.044
meddra_concept_id_3 0.926 0.609 0.047 NaN NaN NaN NaN 0.191 0.434 0.782 0.802 0.924 1.000 0.910 NaN NaN NaN 0.711 0.058
meddra_concept_id_4 1.000 0.101 0.000 NaN NaN NaN NaN 0.165 0.435 0.384 1.000 0.254 0.910 1.000 NaN NaN NaN 1.000 0.011
relationship_id_12 1.000 0.000 0.000 0.966 0.966 0.966 0.966 NaN NaN NaN NaN NaN NaN NaN 1.000 0.966 0.966 0.172 0.000
relationship_id_23 1.000 0.000 0.000 0.966 0.966 0.966 0.966 NaN NaN NaN NaN NaN NaN NaN 0.966 1.000 0.966 0.172 0.000
relationship_id_34 1.000 0.000 0.000 0.966 0.966 0.966 0.966 NaN NaN NaN NaN NaN NaN NaN 0.966 0.966 1.000 0.172 0.000
soc_category 1.000 0.374 0.000 0.172 0.172 0.172 0.172 0.191 0.380 0.498 0.817 0.423 0.711 1.000 0.172 0.172 0.172 1.000 0.351
pediatric_adverse_event 0.449 0.116 0.000 0.000 0.000 0.000 0.000 0.108 0.104 0.152 0.118 0.044 0.058 0.011 0.000 0.000 0.000 0.351 1.000
[Figure: correlation heatmap; the corresponding matrix follows.]
meddra_concept_class_id_1 meddra_concept_class_id_2 meddra_concept_class_id_3 meddra_concept_class_id_4 meddra_concept_name_4 pediatric_adverse_event relationship_id_12 relationship_id_23 relationship_id_34 soc_category
meddra_concept_class_id_1 1.000 0.833 0.833 0.833 0.999 0.000 0.833 0.833 0.833 0.171
meddra_concept_class_id_2 0.833 1.000 0.833 0.833 0.999 0.000 0.833 0.833 0.833 0.171
meddra_concept_class_id_3 0.833 0.833 1.000 0.833 0.999 0.000 0.833 0.833 0.833 0.171
meddra_concept_class_id_4 0.833 0.833 0.833 1.000 0.999 0.000 0.833 0.833 0.833 0.171
meddra_concept_name_4 0.999 0.999 0.999 0.999 1.000 0.357 0.999 0.999 0.999 0.999
pediatric_adverse_event 0.000 0.000 0.000 0.000 0.357 1.000 0.000 0.000 0.000 0.351
relationship_id_12 0.833 0.833 0.833 0.833 0.999 0.000 1.000 0.833 0.833 0.171
relationship_id_23 0.833 0.833 0.833 0.833 0.999 0.000 0.833 1.000 0.833 0.171
relationship_id_34 0.833 0.833 0.833 0.833 0.999 0.000 0.833 0.833 1.000 0.171
soc_category 0.171 0.171 0.171 0.171 0.999 0.351 0.171 0.171 0.171 1.000
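
The matrices above are not labelled with the association measure used. For pairs of categorical variables, one widely used measure is Cramér's V, derived from the chi-squared statistic of the contingency table. The sketch below is a plain, uncorrected version written under that assumption; the function name `cramers_v` and the toy inputs are illustrative, and the values it produces are not claimed to match any matrix in this report.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # row marginals
    cols = Counter(ys)             # column marginals
    chi2 = 0.0
    for x, row_total in rows.items():
        for y, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return float("nan")        # undefined when a variable is constant
    return math.sqrt(chi2 / (n * k))


# Toy data: y is fully determined by x, so the association is maximal.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
```

A value near 1 means one variable almost determines the other, and a value near 0 means the two are close to independent.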

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
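
Nullity correlation can be sketched as the Pearson correlation between two presence indicators (1 where a value is present, 0 where it is missing). The example below assumes missing values are represented as `None`; the function name `nullity_correlation` and the toy columns are hypothetical.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a fully present or fully missing column has no variance
    return cov / math.sqrt(var_a * var_b)


# Toy columns: the first two are missing on the same rows, the third on the others.
codes = [None, 10078097, 10078117, None]
names = [None, "PT", "PT", None]
other = [1, None, None, 2]
print(nullity_correlation(codes, names))
print(nullity_correlation(codes, other))
```

A value of 1 means two columns are always missing together, and -1 means one is present exactly when the other is missing.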

Sample

First rows

meddra_concept_name_4 meddra_concept_id neventreports meddra_concept_class_id_1 meddra_concept_class_id_2 meddra_concept_class_id_3 meddra_concept_class_id_4 meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 meddra_concept_name_1 meddra_concept_name_2 meddra_concept_name_3 relationship_id_12 relationship_id_23 relationship_id_34 soc_category pediatric_adverse_event
0 nan 35305812 3 nan nan nan nan NaN NaN NaN NaN NaN NaN NaN nan nan nan nan nan nan nan 0
1 nan 35808913 1 nan nan nan nan NaN NaN NaN NaN NaN NaN NaN nan nan nan nan nan nan nan 0
2 nan 36211484 272 nan nan nan nan NaN NaN NaN NaN NaN NaN NaN nan nan nan nan nan nan nan 0
3 Blood and lymphatic system disorders 788149 1 PT HLT HLGT SOC 10078097.0 10026847.0 10002086.0 10005329.0 35102369.0 35102033.0 35100000.0 Gelatinous transformation of the bone marrow Marrow depression and hypoplastic anaemias Anaemias nonhaemolytic and marrow depression Is a Is a Is a anatomic_site_disorder 0
4 Blood and lymphatic system disorders 788161 15 PT HLT HLGT SOC 10078117.0 10052828.0 10047954.0 10005329.0 35102448.0 35102049.0 35100000.0 Eosinophilic granulomatosis with polyangiitis Eosinophilic disorders White blood cell disorders Is a Is a Is a anatomic_site_disorder 0
5 Blood and lymphatic system disorders 788294 4 PT HLT HLGT SOC 10077465.0 10028578.0 10018865.0 10005329.0 35102379.0 35102036.0 35100000.0 Myeloproliferative neoplasm Myeloproliferative disorders (excl leukaemias) Haematopoietic neoplasms (excl leukaemias and lymphomas) Is a Is a Is a anatomic_site_disorder 0
6 Blood and lymphatic system disorders 788327 1 PT HLT HLGT SOC 10077533.0 10077528.0 10025320.0 10005329.0 788074.0 35102042.0 35100000.0 Marginal zone lymphoma recurrent Marginal zone lymphomas NEC Lymphomas non-Hodgkin's B-cell Is a Is a Is a anatomic_site_disorder 0
7 Blood and lymphatic system disorders 788452 1 PT HLT HLGT SOC 10077833.0 10043555.0 10035534.0 10005329.0 35102440.0 35102046.0 35100000.0 Congenital thrombocytopenia Thrombocytopenias Platelet disorders Is a Is a Is a anatomic_site_disorder 0
8 Blood and lymphatic system disorders 35104065 2 PT HLT HLGT SOC 10002043.0 10002042.0 10002086.0 10005329.0 35102366.0 35102033.0 35100000.0 Anaemia folate deficiency Anaemia deficiencies Anaemias nonhaemolytic and marrow depression Is a Is a Is a anatomic_site_disorder 0
9 Blood and lymphatic system disorders 35104066 7 PT HLT HLGT SOC 10066468.0 10002042.0 10002086.0 10005329.0 35102366.0 35102033.0 35100000.0 Anaemia of pregnancy Anaemia deficiencies Anaemias nonhaemolytic and marrow depression Is a Is a Is a anatomic_site_disorder 0

Last rows

meddra_concept_name_4 meddra_concept_id neventreports meddra_concept_class_id_1 meddra_concept_class_id_2 meddra_concept_class_id_3 meddra_concept_class_id_4 meddra_concept_code_1 meddra_concept_code_2 meddra_concept_code_3 meddra_concept_code_4 meddra_concept_id_2 meddra_concept_id_3 meddra_concept_id_4 meddra_concept_name_1 meddra_concept_name_2 meddra_concept_name_3 relationship_id_12 relationship_id_23 relationship_id_34 soc_category pediatric_adverse_event
16931 Vascular disorders 45887483 1 PT HLT HLGT SOC 10074494.0 10019630.0 10014523.0 10047065.0 37604021.0 37602358.0 37600000.0 Hepatic vascular thrombosis Hepatic and portal embolism and thrombosis Embolism and thrombosis Is a Is a Is a anatomic_site_disorder 0
16932 Vascular disorders 46276587 14 PT HLT HLGT SOC 10076566.0 10038605.0 10003216.0 10047065.0 37604012.0 37602356.0 37600000.0 Perineal necrosis Reproductive system necrosis and vascular insufficiency Arteriosclerosis, stenosis, vascular insufficiency and necrosis Is a Is a Is a anatomic_site_disorder 0
16933 Vascular disorders 46276592 3 PT HLT HLGT SOC 10076575.0 10057188.0 10047066.0 10047065.0 37604038.0 37602360.0 37600000.0 Harlequin syndrome Site specific vascular disorders NEC Vascular disorders NEC Is a Is a Is a anatomic_site_disorder 0
16934 Vascular disorders 46276611 5 PT HLT HLGT SOC 10076605.0 10037401.0 10057166.0 10047065.0 37203755.0 37602362.0 37600000.0 Right-to-left cardiac shunt Pulmonary hypertensions Vascular hypertensive disorders Is a Is a Is a anatomic_site_disorder 0
16935 Vascular disorders 46276734 1 PT HLT HLGT SOC 10076931.0 10017984.0 10003216.0 10047065.0 37604006.0 37602356.0 37600000.0 Strangulated umbilical hernia Gastrointestinal necrosis and vascular insufficiency Arteriosclerosis, stenosis, vascular insufficiency and necrosis Is a Is a Is a anatomic_site_disorder 0
16936 Vascular disorders 46276768 1 PT HLT HLGT SOC 10076994.0 10008192.0 10003216.0 10047065.0 37604004.0 37602356.0 37600000.0 Lacunar stroke Cerebrovascular and spinal necrosis and vascular insufficiency Arteriosclerosis, stenosis, vascular insufficiency and necrosis Is a Is a Is a anatomic_site_disorder 0
16937 Vascular disorders 46276769 4 PT HLT HLGT SOC 10076999.0 10057181.0 10011954.0 10047065.0 37604017.0 37602357.0 37600000.0 Bezold-Jarisch reflex Vascular hypotensive disorders Decreased and nonspecific blood pressure disorders and shock Is a Is a Is a anatomic_site_disorder 0
16938 Vascular disorders 46276781 4 PT HLT HLGT SOC 10077023.0 10037401.0 10057166.0 10047065.0 37203755.0 37602362.0 37600000.0 Alveolar capillary dysplasia Pulmonary hypertensions Vascular hypertensive disorders Is a Is a Is a anatomic_site_disorder 1
16939 Vascular disorders 46276809 7 PT HLT HLGT SOC 10077110.0 10047067.0 10047066.0 10047065.0 37604032.0 37602360.0 37600000.0 Vein rupture Non-site specific vascular disorders NEC Vascular disorders NEC Is a Is a Is a anatomic_site_disorder 0
16940 Vascular disorders 46276815 1 PT HLT HLGT SOC 10077143.0 10029558.0 10003216.0 10047065.0 37604009.0 37602356.0 37600000.0 Vascular stent occlusion Non-site specific necrosis and vascular insufficiency NEC Arteriosclerosis, stenosis, vascular insufficiency and necrosis Is a Is a Is a anatomic_site_disorder 0